8 research outputs found

    EFFICIENT APPROACH FOR VIEW SELECTION FOR DATA WAREHOUSE USING TREE MINING AND EVOLUTIONARY COMPUTATION

    Selecting a proper set of views to materialize plays an important role in database performance. Many view selection methods exist, using different techniques and frameworks to select an efficient set of views for materialization. In this paper, we present a new efficient, scalable method for view selection under given storage constraints using a tree mining approach and evolutionary optimization. The tree mining algorithm is designed to determine the exact frequency of (sub)queries in the historical SQL dataset. The query cost model maximizes the performance benefit of the final view set, which is derived from the frequent view set produced by the tree mining algorithm. The performance benefit of a query is defined as a function of query frequency, query creation cost, and query maintenance cost. The experimental results show that the proposed method succeeds in recommending a solution that is fairly close to the optimal solution.
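
    As a rough illustration of selecting views under a storage constraint, the sketch below scores each candidate with an assumed benefit function and picks views greedily by benefit per unit of storage. The abstract does not give the benefit formula, and the paper uses evolutionary optimization rather than a greedy pass, so `CandidateView`, `benefit`, and `select_views` are hypothetical names and the greedy heuristic is a simpler stand-in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateView:
    name: str
    frequency: int           # occurrences of the (sub)query in the workload
    creation_cost: float     # cost of computing the query result once
    maintenance_cost: float  # cost of keeping the materialized view fresh
    size: float              # storage the view would occupy (must be > 0)


def benefit(view):
    # ASSUMPTION: cost saved over all occurrences minus upkeep.
    # The abstract only says benefit depends on these three quantities.
    return view.frequency * view.creation_cost - view.maintenance_cost


def select_views(candidates, storage_budget):
    # Greedy stand-in for the paper's evolutionary search:
    # keep views with positive benefit, best benefit-per-size first.
    useful = [v for v in candidates if benefit(v) > 0]
    ranked = sorted(useful, key=lambda v: benefit(v) / v.size, reverse=True)
    chosen, used = [], 0.0
    for view in ranked:
        if used + view.size <= storage_budget:
            chosen.append(view)
            used += view.size
    return chosen


candidates = [
    CandidateView("A", frequency=10, creation_cost=5.0, maintenance_cost=10.0, size=4.0),
    CandidateView("B", frequency=2, creation_cost=3.0, maintenance_cost=10.0, size=1.0),
    CandidateView("C", frequency=5, creation_cost=4.0, maintenance_cost=2.0, size=6.0),
    CandidateView("D", frequency=3, creation_cost=10.0, maintenance_cost=0.0, size=2.0),
]
selected = select_views(candidates, storage_budget=7.0)
print([v.name for v in selected])
```

    A greedy pass like this can miss the optimal set when view sizes interact with the budget, which is one reason a search-based method such as evolutionary optimization may be preferred.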

    Probabilistic Page Replacement Policy in Buffer Cache Management for Flash-Based Cloud Databases

    In the fast evolution of storage systems, the newly emerged flash memory-based Solid State Drives (SSDs) are becoming an important part of the computer storage hierarchy. Among the several advantages of flash-based SSDs, high read performance and low power consumption are of primary importance. Among their few disadvantages, the asymmetric I/O latencies for read, write, and erase operations are the most crucial for overall performance. In this paper, we propose two novel probabilistic adaptive algorithms that compute the future probability of reference based on the recency, frequency, and periodicity of past page references. Page replacement is performed by considering the probability of reference of cached pages as well as the asymmetric read-write-erase properties of flash devices. The experimental results show that our proposed method succeeds in minimizing the performance overheads of flash-based systems while maintaining a good hit ratio. The results also justify the utility of a genetic algorithm in maximizing the overall performance gains.
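
    The idea of weighing reference probability against asymmetric flash costs can be sketched as follows. This is not the paper's algorithm: the probability estimate here is an assumed weighted mix of recency and frequency (periodicity is omitted), and the cost constants, `Page`, `reference_probability`, and `choose_victim` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    page_id: int
    last_access: int   # logical timestamp of the most recent reference
    access_count: int  # number of references seen so far
    dirty: bool        # a dirty page must be written back to flash on eviction


def reference_probability(page, now, total_accesses, w_recency=0.5):
    # ASSUMPTION: simple weighted mix of a recency score and a frequency share.
    recency = 1.0 / (1 + now - page.last_access)
    frequency = page.access_count / total_accesses if total_accesses else 0.0
    return w_recency * recency + (1 - w_recency) * frequency


def eviction_cost(page, now, total_accesses, read_cost=1.0, write_cost=4.0):
    # Expected cost of re-reading the page later, plus the write-back
    # penalty if the page is dirty (writes are costlier than reads on flash).
    cost = reference_probability(page, now, total_accesses) * read_cost
    if page.dirty:
        cost += write_cost
    return cost


def choose_victim(pages, now, read_cost=1.0, write_cost=4.0):
    total = sum(p.access_count for p in pages)
    return min(pages, key=lambda p: eviction_cost(p, now, total, read_cost, write_cost))


buffer_pages = [
    Page(1, last_access=9, access_count=5, dirty=False),
    Page(2, last_access=2, access_count=1, dirty=True),
    Page(3, last_access=1, access_count=1, dirty=False),
]
victim = choose_victim(buffer_pages, now=10)
print(victim.page_id)
```

    With these assumed costs, a cold but dirty page is kept in favor of evicting a cold clean page, which reflects the read-write asymmetry the abstract describes.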

    Mining Query Plans for Finding Candidate Queries and Sub-Queries for Materialized Views in BI Systems Without Cube Generation

    Materialized views are important for optimizing Business Intelligence (BI) systems that are designed without data cubes. Selecting candidate queries for materialized views from a large number of queries is a challenging task. Most past work finds frequent queries in the past workload and creates materialized views from them, either by manually analyzing the workload or by applying approximate string matching algorithms to the query text. Most existing methods suggest complete queries but ignore query components, such as subqueries, when creating materialized views. This paper presents a novel method to determine the queries and query components on which materialized views can be created to optimize aggregate and join queries, by mining a database of query execution plans that are in the form of binary trees. The proposed algorithm optimizes significantly more queries because it selects queries based on their execution plan trees rather than on the query text used by traditional methods. To select a correct set of queries to optimize using materialized views, the paper proposes an efficient, specialized frequent tree component mining algorithm with novel heuristics to prune the search space. These frequent components are used to determine the possible set of candidate queries for the creation of materialized views. Experiments on standard, real, and synthetic data sets, together with the theoretical basis, show that the proposed method is able to optimize a large number of queries with fewer materialized views and achieves a significant improvement in performance compared to traditional methods.
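
    A minimal sketch of counting repeated components across binary plan trees is shown below. It matches only complete subtrees by exact structure and does not reproduce the paper's specialized mining algorithm or its pruning heuristics. The tuple encoding `(operator, left, right)` and the function names are assumptions made for illustration.

```python
from collections import Counter


def subtree_keys(node, out):
    """Append a canonical string for every subtree under node; return node's own key."""
    if node is None:
        return "-"
    op, left, right = node
    key = f"({op} {subtree_keys(left, out)} {subtree_keys(right, out)})"
    out.append(key)
    return key


def frequent_subtrees(plans, min_support):
    # Support = number of plans that contain the subtree at least once.
    support = Counter()
    for plan in plans:
        keys = []
        subtree_keys(plan, keys)
        support.update(set(keys))
    return {key: count for key, count in support.items() if count >= min_support}


# Hypothetical execution plans encoded as (operator, left_child, right_child).
scan_orders = ("Scan orders", None, None)
scan_customers = ("Scan customers", None, None)
join = ("HashJoin", scan_orders, scan_customers)
plans = [
    ("Aggregate", join, None),
    ("Sort", join, None),
    ("Filter", scan_orders, None),
]
frequent = frequent_subtrees(plans, min_support=2)
for key, count in sorted(frequent.items()):
    print(count, key)
```

    Here the shared join subtree surfaces as a candidate even though the three query texts would differ, which is the advantage of working on plan trees rather than query strings.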

    NoSQL Databases: Modern Data Systems for Big Data Analytics - Features, Categorization and Comparison

    Because of the massive utilization of the World Wide Web and the heavy use of electronic gadgets to access the online world, there is exponential growth in the information produced by these hardware gadgets. The data produced by different sources, such as smart transportation, healthcare, and e-commerce, are large, complex, and heterogeneous. Therefore, storing and querying this data, termed "Big Data," is challenging. This paper compares relational databases with a few of the popular NoSQL databases. The performance of various databases in executing join queries, filter queries, and aggregate queries on large datasets is compared on a single node and on multinode clusters. The experimental results demonstrate the suitability of NoSQL databases for Big Data analytics and for supporting interactive web applications with a large user base.

    Query Optimization in OODBMS using Query Decomposition & Query Caching

    Query optimization is of great importance for the performance of databases, especially for the execution of complex query statements. A query optimizer determines the best strategy for performing each query. These decisions have a tremendous effect on query performance, and query optimization is a key technology for every application, from operational systems to data warehouse and analysis systems to content-management systems. For example, query optimizers transform complex query statements into semantically equivalent but better-performing ones. The query optimizer also chooses, for example, whether or not to use indexes for a given query and which join techniques to use when joining multiple tables. Query optimizers are typically cost-based. In a cost-based optimization strategy, multiple execution plans are generated for a given query, and an estimated cost is computed for each plan. The query optimizer chooses the plan with the lowest estimated cost. This report is based on a relatively new approach to query optimization in object databases, which uses query decomposition and cached query results to improve query execution times. Multiple experiments were performed to demonstrate the effectiveness of this approach. The limitation of this technique is that it is useful mainly in scenarios where the data manipulation rate is very low compared to the data retrieval rate.
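
    The two ideas in this abstract, picking the plan with the lowest estimated cost and reusing cached results for decomposed subqueries, can be sketched roughly as below. The report's actual decomposition and caching scheme is not described here, so `cheapest_plan`, `CachedExecutor`, and the whole-cache invalidation policy are assumptions.

```python
def cheapest_plan(plans):
    # Each plan is a (description, estimated_cost) pair; pick the lowest cost.
    return min(plans, key=lambda plan: plan[1])


class CachedExecutor:
    """Runs decomposed subqueries, reusing cached results where possible."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that actually executes a subquery
        self._cache = {}
        self.hits = 0

    def execute(self, subqueries):
        results = []
        for subquery in subqueries:
            if subquery in self._cache:
                self.hits += 1
            else:
                self._cache[subquery] = self._run_query(subquery)
            results.append(self._cache[subquery])
        return results

    def invalidate(self):
        # ASSUMPTION: any data modification discards every cached result.
        self._cache.clear()


executed = []


def fake_run(subquery):
    # Stand-in for a real database call; records what was executed.
    executed.append(subquery)
    return subquery.upper()


executor = CachedExecutor(fake_run)
first = executor.execute(["select a", "select b"])
second = executor.execute(["select b", "select c"])
executor.invalidate()
third = executor.execute(["select a"])
print(cheapest_plan([("index scan", 12.0), ("full scan", 40.0)])[0], executor.hits, len(executed))
```

    Because every write clears the cache in this sketch, frequent updates would erase most of the benefit, which matches the stated limitation that the technique suits read-heavy workloads.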